Tutorial: Running populations with binary_c-python from a source file

This notebook will show you how to evolve a population of stars through a source file that contains a set of pre-determined systems.

To enable source file sampling, configure the population object with evolution_type="source_file" and provide the path to the source file through the source_file_sampling_filename option.

[1]:

import os

from binarycpython.utils.functions import temp_dir, output_lines
from binarycpython import Population

TMP_DIR = temp_dir("notebooks", "notebook_source_file", clean_path=True)

We will first set up a Population object with the correct configuration for source file sampling, and add a parsing function and a custom logging routine. Please note that the custom logging and the parsing of the binary_c output shown here are deliberately simple. Actual use-cases handle their data in more complex ways; these examples are only intended to support the showcase of source file sampling.

[2]:

source_file_pop = Population(
    tmp_dir=TMP_DIR,
    evolution_type="source_file",
    num_cores=1
)

[3]:

# Create custom logging statement
custom_logging_code = """
Printf("EXAMPLE_SOURCE_FILE_LOGGING %30.12e %g %g %g %d\\n",
    //
    stardata->model.time, // 1
    stardata->star[0].mass, // 2
    stardata->common.zero_age.mass[0], // 3
    stardata->model.probability, // 4
    stardata->star[0].stellar_type // 5
);
"""

source_file_pop.set(
    C_logging_code=custom_logging_code
)

[4]:

def parse_function(self, output):
    """
    Example parse function
    """

    parameters = ["time", "mass", "zams_mass", "probability", "stellar_type"]

    # Keep the last parsed line; this stays None if no matching line is found.
    value_dict = None

    # Go over the output.
    for line in output_lines(output):
        headerline = line.split()[0]

        # Check the header and act accordingly
        if headerline == "EXAMPLE_SOURCE_FILE_LOGGING":
            values = line.split()[1:]

            # Check if the length matches the expected length
            if len(parameters) != len(values):
                raise ValueError("Number of column names isn't equal to number of columns")

            # Store the values of this line under their parameter names
            value_dict = {key: float(value) for key, value in zip(parameters, values)}

    # To avoid filling the notebook with every timestep, print only the last parsed line. The purpose of this example is to show how things work.
    print(value_dict)


# Add the parsing function
source_file_pop.set(
    parse_function=parse_function,
)
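The splitting logic of the parse function can be tried on a single line without running binary_c. The following is a minimal standalone sketch: parse_logging_line is a hypothetical helper name, and the sample line is hand-written to match the format of the logging statement above rather than taken from real binary_c output.

```python
# Column names, in the order used by the custom logging statement.
PARAMETERS = ["time", "mass", "zams_mass", "probability", "stellar_type"]
HEADER = "EXAMPLE_SOURCE_FILE_LOGGING"


def parse_logging_line(line):
    """Return a dict for a matching logging line, or None for any other line.

    Hypothetical helper for illustration; not part of binary_c-python.
    """
    fields = line.split()

    # Skip empty lines and lines with a different header.
    if not fields or fields[0] != HEADER:
        return None

    values = fields[1:]
    if len(values) != len(PARAMETERS):
        raise ValueError("Number of column names isn't equal to number of columns")

    return {key: float(value) for key, value in zip(PARAMETERS, values)}


# Hand-written sample line (an assumption, not captured output).
sample_line = "EXAMPLE_SOURCE_FILE_LOGGING 1.500000000000e+04 1.33478 10 1 13"
parsed = parse_logging_line(sample_line)
print(parsed)
```

Note that every value is converted with float, so the integer stellar type ends up as a float in the resulting dict.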

File content/format

The source file sampling method supports two types of file. The type is selected with the option source_file_sampling_type, which can be set to command or column.

Command based

This type of source file should contain lines that resemble command-line calls to binary_c, i.e. a sequence of key-value pairs, optionally prefixed by binary_c, e.g.:

binary_c M_1 10 M_2 5 orbital_period 10000
binary_c M_1 1 M_2 0.5 orbital_period 1000 metallicity 0.001
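To make the structure of such a line concrete, the sketch below splits one line into key-value pairs. It only illustrates the format described above and is not the parser that binary_c-python uses internally; parse_command_line is a hypothetical name, and the values are kept as strings.

```python
def parse_command_line(line):
    """Split a command-style line into a dict of key-value pairs.

    Hypothetical helper for illustration; not part of binary_c-python.
    """
    tokens = line.split()

    # The leading "binary_c" is optional, so drop it when present.
    if tokens and tokens[0] == "binary_c":
        tokens = tokens[1:]

    # What remains must alternate between keys and values.
    if len(tokens) % 2 != 0:
        raise ValueError(f"Expected key-value pairs, got: {line!r}")

    return dict(zip(tokens[0::2], tokens[1::2]))


system = parse_command_line(
    "binary_c M_1 1 M_2 0.5 orbital_period 1000 metallicity 0.001"
)
print(system)
```

Because each line carries its own keys, different lines may set different parameters, as the second example line does with metallicity.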

[5]:

# create an example source file with systems.
example_command_based_sourcefile = os.path.join(TMP_DIR, 'example_command_based_sourcefile.txt')

with open(example_command_based_sourcefile, 'w') as f:
    f.write("""binary_c M_1 10 M_2 5 orbital_period 10000
binary_c M_1 1 M_2 0.5 orbital_period 1000 metallicity 0.001""")

# Configure the source file sampling
source_file_pop.set(
    source_file_sampling_filename=example_command_based_sourcefile,
    source_file_sampling_type='command'
)

# evolve population
source_file_pop.evolve()

setting up the system_queue_filler now
Loading source file from /tmp/binary_c_python-david/notebooks/notebook_source_file/example_command_based_sourcefile.txt
Source file loaded
Signalling processes to stop
{'time': 15000.0, 'mass': 1.33478, 'zams_mass': 10.0, 'probability': 1.0, 'stellar_type': 13.0}
{'time': 15000.0, 'mass': 0.602425, 'zams_mass': 1.0, 'probability': 1.0, 'stellar_type': 11.0}

****************************************************
*                Process 0 finished:               *
*  generator started at 2023-05-18T17:57:54.917371 *
* generator finished at 2023-05-18T17:57:55.331367 *
*                   total: 0.41s                   *
*           of which 0.36s with binary_c           *
*                   Ran 2 systems                  *
*           with a total probability of 2          *
*         This thread had 0 failing systems        *
*       with a total failed probability of 0       *
*   Skipped a total of 0 zero-probability systems  *
*                                                  *
****************************************************


**********************************************************
*  Population-5407e244c9ee45b2848f4cb2dce32b03 finished! *
*               The total probability is 2.              *
*  It took a total of 0.80s to run 2 systems on 1 cores  *
*                   = 0.80s of CPU time.                 *
*              Maximum memory use 170.023 MB             *
**********************************************************

No failed systems were found in this run.

[5]:

{'population_id': '5407e244c9ee45b2848f4cb2dce32b03',
 'evolution_type': 'source_file',
 'failed_count': 0,
 'failed_prob': 0,
 'failed_systems_error_codes': [],
 'errors_exceeded': False,
 'errors_found': False,
 'total_probability': 2,
 'total_count': 2,
 'start_timestamp': 1684429074.8716478,
 'end_timestamp': 1684429075.6765742,
 'time_elapsed': 0.8049263954162598,
 'total_mass_run': 16.5,
 'total_probability_weighted_mass_run': 16.5,
 'zero_prob_stars_skipped': 0}

Alright. That worked well! Please note that some of the analytics dict output is not meaningful here (e.g. total_probability_weighted_mass_run), because the systems are not sampled from actual probability distribution functions: each system is simply assigned a probability of 1.
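The numbers in the analytics dict are consistent with a simple sum over the two systems. The sketch below reproduces that arithmetic by hand, assuming that total_mass_run adds up the primary and secondary masses and that the weighted variant multiplies each system's mass by its probability; this is an inference from the values shown, not a description of the library's implementation.

```python
# The two systems from the command based source file, each with probability 1
# (the probability value is taken from the parsed output above).
systems = [
    {"M_1": 10, "M_2": 5, "probability": 1.0},
    {"M_1": 1, "M_2": 0.5, "probability": 1.0},
]

# Plain sum of all stellar masses.
total_mass = sum(s["M_1"] + s["M_2"] for s in systems)

# Probability-weighted sum; identical to the plain sum when every probability is 1.
weighted_mass = sum(s["probability"] * (s["M_1"] + s["M_2"]) for s in systems)

print(total_mass, weighted_mass)
```

Both sums come out at 16.5, which is why the weighted quantity carries no extra information in this run.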

Let’s try the column based sampling next.

Column based

This type of source file should start with a header line that indicates which parameter is stored in which column. The subsequent lines should contain only the values of the corresponding parameters, e.g.:

M_1 M_2 orbital_period
10 5 1
1 0.5 1000
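The relation between the header and the value lines can be sketched as follows. This is an illustration of the format only, assuming whitespace-separated columns as in the example above; parse_column_source is a hypothetical name and not the reader used by binary_c-python.

```python
def parse_column_source(text):
    """Turn a header line plus value lines into one dict per system.

    Hypothetical helper for illustration; not part of binary_c-python.
    """
    # Ignore blank lines so a trailing newline does not matter.
    lines = [line for line in text.splitlines() if line.strip()]

    # The first line names the parameter stored in each column.
    header = lines[0].split()

    systems = []
    for line in lines[1:]:
        values = line.split()
        if len(values) != len(header):
            raise ValueError(f"Expected {len(header)} values, got: {line!r}")
        systems.append(dict(zip(header, values)))

    return systems


example = """M_1 M_2 orbital_period
10 5 1
1 0.5 1000"""

systems = parse_column_source(example)
print(systems)
```

Unlike the command based format, every system in a column based file sets the same parameters, since the header is shared by all lines.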

[6]:

# create an example source file with systems.
# The file name is reused from the previous example, so that file is overwritten here.
example_column_based_sourcefile = os.path.join(TMP_DIR, 'example_command_based_sourcefile.txt')

with open(example_column_based_sourcefile, 'w') as f:
    f.write("""M_1 M_2 orbital_period
2 1 1
0.7 0.5 1000""")

# Configure the source file sampling
source_file_pop.set(
    source_file_sampling_filename=example_column_based_sourcefile,
    source_file_sampling_type='column'
)

# evolve population
source_file_pop.evolve()

setting up the system_queue_filler now
Loading source file from /tmp/binary_c_python-david/notebooks/notebook_source_file/example_command_based_sourcefile.txt
Source file loaded
Signalling processes to stop
{'time': 15000.0, 'mass': 0.680111, 'zams_mass': 2.0, 'probability': 1.0, 'stellar_type': 11.0}
{'time': 15000.0, 'mass': 0.7, 'zams_mass': 0.7, 'probability': 1.0, 'stellar_type': 0.0}

****************************************************
*                Process 0 finished:               *
*  generator started at 2023-05-18T17:57:55.737947 *
* generator finished at 2023-05-18T17:57:56.064031 *
*                   total: 0.33s                   *
*           of which 0.27s with binary_c           *
*                   Ran 2 systems                  *
*           with a total probability of 2          *
*         This thread had 0 failing systems        *
*       with a total failed probability of 0       *
*   Skipped a total of 0 zero-probability systems  *
*                                                  *
****************************************************


**********************************************************
*  Population-680d1fe0533f4e6f9ce1ea5794b07e9b finished! *
*               The total probability is 2.              *
*  It took a total of 0.72s to run 2 systems on 1 cores  *
*                   = 0.72s of CPU time.                 *
*              Maximum memory use 172.641 MB             *
**********************************************************

No failed systems were found in this run.

[6]:

{'population_id': '680d1fe0533f4e6f9ce1ea5794b07e9b',
 'evolution_type': 'source_file',
 'failed_count': 0,
 'failed_prob': 0,
 'failed_systems_error_codes': [],
 'errors_exceeded': False,
 'errors_found': False,
 'total_probability': 2,
 'total_count': 2,
 'start_timestamp': 1684429075.7047536,
 'end_timestamp': 1684429076.4274688,
 'time_elapsed': 0.7227151393890381,
 'total_mass_run': 4.2,
 'total_probability_weighted_mass_run': 4.2,
 'zero_prob_stars_skipped': 0}

That works fine too!